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1. Introduction 

Dc and rf discharges at low pressures, such as capacitively coupled plasmas (CCP), 
inductively coupled plasmas (ICP) and magnetrons, have played critical roles as etching and 
depositing devices in semiconductor industry IT] [31, as well as in some other applications, 
such as plasma lighting, displays, healthcare and Hall thrusters ||3J. Computer simulations 
have been demonstrated as a powerful tool in this field. They can give unique insights into 
the fundamental plasma physics and reduce the workload for the industrial reactor designers 
significantly. 

There are three commonly used simulation techniques, namely, the fluid, Particle-in- 
cell(PIC), and Boltzmann model in plasma physics research ||4]|5]|6|. PIC model |7 8 1 solves 
the Newton and Maxwell equations directly. Kinetic, non-local and non-equilibrium effects 
can be included. In addition. Direct Simulation Monte Carlo (DSMC) method Q is used for 
modeling rarefied neutral gas flows, in which the mean free path of a molecule is on the order 
of (or greater than) the characteristic physical scale. Both PIC and DSMC models need to be 
coupled together to depict these discharges. This method, often referred as PIC/MC model, 
was firstly introduced at early 1990s [ T0l[m[T2ll . PIC/MC simulations for these discharges 
adopted simplified model from pure PIC model and DSMC method. In many discharging 
systems, electrostatic or Darwin modeling are sufficient. At the same time, because the plasma 
density and gas pressure is low, the neutral molecules are in their own thermal equilibrium, 
and the Coulomb collision rate is relatively low. Therefore one needs only to consider the 
collisions between the charged particles and the neutral molecules. Null collision method had 
been proved to be an effective method to treat these MC processes, in which one need not to 
scan all the particles, as it often does in DSMC simulations. 

The main difficulties of PIC/MC simulations are the costs of computational resources. 
In conventional electrostatic PIC simulations, the spatial and temporal steps must be chosen 
to resolve the fastest temporal and finest spatial behavior of the electrons, namely. Ax < Ao, 
iOpht < 2 and Ax/Af < v,, where v, is the electron thermal velocity. This would require 
hundreds to thousands cells per cm and the time step of 10"'^ ~ lO^'^s in these discharge 
modeling. On the other hand, typically more than 100 particles per cell are needed to get 
rid of the stochastic errors in MC process lfT3l . Since the computational costs are very 
high, most PIC/MC simulations were only done in ID geometry up to now. Conventional 
2D or 3D simulations are only possible for some cases where the densities are relatively 
low lfT4l [161 WT\ . For most higher density cases in CCP, PIC/MC simulations of practical 
interesting systems often run on supercomputers iTSl [161 [TSl . To overcome this problem, 
some fast but non-standard algorithms were proposed, such as the global PIC/MC method 
|[T9ll and fluid PIC |[20l [2T|| . However, they made additional approximations and may not 
be sufficient for investigating of some kinetics effects, for example, when the distribution 
functions of electrons are anisotropic |22|. 

Fortunately, the fastest phenomena are usually not very important in these discharges. 
As the fastest oscillation modes of electrons are not very important here, we only need solve 
rf frequency lOrf plus some harmonics. By damping some high frequency modes, implicit 
PIC/MC method can eliminate the major constrains of grid spacing and time step on explicit 
PIC codes, while most of the kinetics effects are still kept, and thus it could be a better 
approach to these problems. However, for implicit model, more complicated algorithms 
must be introduced, and the numerical schemes should be carefully treated, especially when 
coupling with MC model and in axisymmetric geometry. Here are the main difficulties origin 
from the cylindrical geometry listed below: (1) weighted particles resizing; (2) implicit 
particle pushing in cylindrical geometry; and (3) Poisson solver and its parallelization in 
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cylindrical geometry, especially when R :» L, where R is the device radius and the L is 
the electrode spacing. There are many solutions to them for the explicit simulations. For 
example, Nanbu [14l[l5l|23l|24ll23l26llI3l28l has solved neai'ly all essential problems 
associated with the MC method and the axisymmetric geometry for explicit algorithms by 
developing some new methods and introducing some methods from DSMC model. But for 
the implicit simulations with MC model, subtle difficulties exist still. 

This is the first of our two serial papers. In this paper, we report a direct implicit and 
electrostatic PIC/MC simulation model for CCP in two-dimensional axisymmetric geometry. 
Although our model is designed for CCP, it can be used to study some other problems, such 
as dc and atmosphere discharges. This work would not have been possible without those 
who have developed the PIC and DSMC method to their present advanced state. We should 
emphasize that most of the numerical techniques here have been developed by many previous 
researchers, many of which have histories of more than ten years. We just try to incorporate 
these algorithms together, then analyze and compare all possible numerical treatments to meet 
the specific requirements. The PIC algorithms mainly follow up the works by Birdsall Il8i il21 . 
Langdon and Cohen l29l [30l ED, Verboncoeur |[321 [3ll [HI, Vahedi l35] |36l |33, Hewett 
Il38l [39l and Friedman 140]. The MC algorithms mainly follow up the works by Bird f9l, 
Nanbu p25| and Vahedi 141]. We will discuss these numerical schemes in Sec. 2. Simulation 
results and benchmarks are given in Sec. 3. Discussions and a brief summary are presented in 
Sec.4. 

2. Numerical Methods 

2.1. Direct Implicit PIC Simulation 

In the Implicit Particle-In-Cell (IPIC) schemes, the field in the next step E"^^ which depends 
on the future charge density p"^' must be known at n step. There are two kinds of algorithms, 
namely, direct implicit simulation (DIPIC) l|29l [30l [38l |40|| and implicit movement method 
simulation (IMMPIC) fill |43l |44l [33 . In DIPIC, the field equations ai-e derived from direct 
summation and extrapolation of the particles moving equations. In the implicit movement 
method, the field equations are derived from the Vlasov movement equations. 

Here we applied the direct implicit simulation method which was proposed by 
Langdon ||29l and Friedman |40|. In brief, in "Dl" electrostatic DIPIC algorithm, the particle 
pushing procedure is divided into "firstpush": 




and "finalpush": 




where 



2 



(1) 



Between the two pushing procedures, electric field in time f"^' is solved by 



(2) 
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where p"^' denotes the charge density contributed by and 




V 



The denotes summation over all species particles. 

After the pushing and field solving procedures are executed, the MC procedure is 
executed. 

In summary, the simulation cycle consists of the following steps: (1) first pushing; (2) 
weighting; (3) solve the field equation; (4) final pushing; and (5) MC process. The first 
pushing in (« + 1),/, cycle and final pushing in n,;, cycle can be merged into one procedure, so 
only one passing through the particles is needed, which will improve the code efliciency. The 
only difference is the MC procedure parameters {x" or x"). Our ID and 2D benchmarks show 
that both methods produce identical results. 

2.2. Unweighted and weighted particles 

Unlike the Descartes coordinates, even the grid spacings and the densities are uniform, the 
volumes of the grid cells are proportional to the radius in cylindrical coordinates. In this case, 
even if super particles were assigned to identical number of physical particles and the grids 
were unform with identical macro particle numbers per cell, they would not give constant 
density. 

In general, we can apply two methods, i.e., unweighted particles and weighted particles. 
One can adopt non-uniform grids along R for unweighted particles Il24l . but this algorithm 
should lead to very large grid spacing near the axis which will produce large errors in implicit 
schemes and Poisson solver. One can also use uniform grid spacing and set the particles 
number in one cell proportional to the radial position. However, MC process requires that the 
super particles number in one cell should not be too small. This method will either produce 
small number of the particles near the axis to disturb the MC process or lead to excessive large 
particle numbers in the outer grid cells. So unweighted particles are not recommended here. 

To overcome this problem, one can adopt weighted particles ||9]|27]|46]|47]|48ij. A certain 
weight Wp is assigned to each particle according to its radial position or the volumes of the 
cell: 



Here Wp is the number of the physical particles which is contained by one super particle. 
This method will make identical numbers of the super particles in each cell when the initial 
densities and the grids are uniform. The major problem associated with this scheme is that 
the super particle needs to be properly resized when it moves radially. 

In DSMC simulations. Bird |9| presented zero-order resizing method, where the super 
particle numbers were changed according to their positions but momentums and energies were 
kept. The new weighting factors may be based on either the radius of the cell or the the radius 
of the particle itself. In the particle based weighting scheme, when a super particle moves 
from radial position r to r' in one step, it has the possibility to be discarded or duplicated 
according to a certain probability to ensure average charge conservation. Cell based weighting 
scheme is similar, except that the particle is discarded or duplicated when it crosses the cell 
boundaries. Cell based weighting scheme is introduced by Nanbu [21 \ from DSMC model, 
and is successfully used for inductive coupled plasma simulations |28|. However, cell based 
weighting is not recommended [9 1, because it will lead to errors when particles moves parallel 
to the axis, and it can not maintain the smooth flow gradients normal to the axis. 



Wp oc Inrhrl^z 



(3) 
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In these resizing processes the charges only conserve on average, which would produce 
the random walks problem. This problem has been well known for long in DSMC simulations 
191. Random walks is not a major problem in DSMC simulations, because the molecular 
interactions are short range force and there are no interactions in the free flight phase. In 
PIC/MC simulations, the dominating electric force is a long range force and the collective 
electron oscillation in radial direction exists. Then the fluctuation of electron density would 
bring electrostatic waves and nonphysical eff'ects. 

It would be more accurate that the super particle number is conserved, i.e., particle 
weight will not be changed when it moves. Some PIC codes |46]|47]|48l adopted this scheme. 
DSMC method also adopted similar treatment, and one often referred to this scheme as 
stochastic weighted method |49|. However, the small weight particle may be replaced by the 
larger weight particle. After the simulation runs some rf periods, we find that the super particle 
numbers near the axis tend to proportional to the radius. There would be only lO's particles 
with large weights near the axis after running some periods. This effect is more obvious for 
electrons, which will bring nonphysical heating in the axis. After the simulation runs some 
periods, the average weights of the particles become larger. So we adopted a special particle 
split scheme. After the code runs some periods, the weights of the particles Wp are checked 
and compared with the weights Wr calculated from their present radial position. If Wp > w,-, 
the particles are split to several new particles (A^ = [wp/wr])- The positions, velocities and 
accelerations of the new particles are duplicated from the old ones. The weights of the new 
and old particles are set to Wr and Wp - Nwr respectively. Here all phase space information is 
kept and the charge is conserved. After the system reaches equilibrium, this splitting method 
will only change the numbers of the super particles, but not change the plasma density and 
the field. This method will increase the super particle numbers, so sometimes one need to 
combine the small particles B6l . 

We have benched all three weighting assigning schemes. When a small number of 
particles is used, especially at small radius, we find that particle based resizing method tend 
to produce larger density than weighting-conserving scheme. With enough large number 
of particle per cell, both methods produced similar results. But for cell based method, 
the radial density is not very smooth and many small peaks appear in the density profile, 
which implies that additional electrostatic modes are excited and thus larger density can be 
generated. Therefore we recommend the weighting-conserving scheme with enough particles 
per cell. When using zero-order resizing scheme, we do not use the global buffer like the 
DSMC method |l9]|50l, but just duplicate the particles. 

2.3. Particle Pushing 

In the first and final pushing steps, the accelerations, velocities and positions of the particles 
are updated. In Descartes coordinate, it is straightforward: every position component should 
be added with the velocity component multiply At while every velocity components should 
be updated accordingly. But in curvilinear coordinate systems, the position components could 
couple together. It is not recommended to push the particles by using the moving equations 
in cylindrical coordinate: using AO - vgAt/r will produce larger AO at small r, then one needs 
adopt smaller Ar and Af at small r. A feasible way is local coordinate transformation. Both 
DSMC method and PIC code ||8] |9l ISTI have adopted this treatment, but here we need some 
modifications. The particle are described by the parameters {r,z,Vr,vg,v^,a,-,a,}. Then the 
first pushing can be applied as follows : 
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v'y = Vfl + -agAt 
x' - X + v'^At 
y' = v;Af 

v' — V. H — a, At 

<2~ 

z' = z + v^Af (4) 
Then the coordinates are rotated, the new velocities and accelerations are given by 
{v^, v'g] - {v[. COS + v', sin 6, -v^ sin + v[. cos 0} 
{a'^., a'g] = {or cos G, -Qr sin 9} 

a'^ - a, (5) 
The final pushing is executed in full X-Y coordinates(/ = r, z): 

x'.' ^x': + --EiAt^ 
' 2 m 

, 1 ^ 
v' = V + - —EiAt 

' 2m 
a,. = -(— £/ + fl,) 

When particles hit on the electrodes,we remove them from the moving particle lists and 
add the charges to the depositing charges of the electrodes. If particles pass the axis, the 
position, velocity and acceleration of the particles in R direction are changed to their absolute 
values. In this scheme we have ignored the ae damping accumulating (equ |[T|). If time steps 
are large, there could be some errors on vg at small radius. However, the dififerences in our 
cases are neglectable. 

The other natural way is to adopt global Descartes (X-Y-Z) coordinate, in which every 
particle has its Descartes coordinates, velocities and accelerations in full three dimensions 
despite that the field is still in two dimensions. The first pushing can be done in Descartes 
coordinate. In the final pushing, we weight the Er with the particle position radius r - 
-^x^ + y2 and calculate the Descartes components of electric field, 

X 

tan6'=- E^-ErCos0 Ey-ErSinO (6) 

y 

Then the final pushing is executed in full X-Y-Z coordinates. If r is very close to zero, we set 
cos 0-1 and sin = in Equation. |6l 

In the particle initialization, we get the particles radius uniformly (r^ - p/N * R, N is 
the total particles number) then Xp and yp are given by 

Xp = rp cos 0, yp - Kp sin 6, (7) 

where 9 is randomly sampled between and 2n. 

We benchmarked above two algorithms and find no obvious differences in the results. 
The local X-Y scheme run slightly faster than the global X-Y-Z scheme. 

2.4. Weighting 

In curvilinear coordinate systems, assigning the particle charge to the grid must be specially 
treated, which has been well studied by Verboncoeur |33, 34j in the most generalized 
form. There are two frequently-used weighting methods in cylindrical coordinates : bilinear 
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weighting in z - r and bilinear weighting in z - r^. In z - r weighting, the real particle number 
Nij assigned to the grid point at (r,, zj) can be written as 

Nij = Wp- -. (8) 

(Zi+i -Zi)(rj+i -rj) 

The particle number N/j assignment in z - weighting is 



]+ 

Here rp — ^xj, + for X-Y-Z coordinates. The density riij is calculated by 



Nu-^P- 7^2 ±- (9) 



(10) 



where Vj = ^7T[rj+i(rj + rj+i) - + rj-i)](z,+i - Z/). and Vo = j7Tr^^(zi+i - z/) for z - r 

weighting; Vj = jMr^j+i - rj_i)izi+i - z,) and Vo = \nr^^{zi+i - z,) for z - weighting. The 
weighting of field Er and acted on the particle are done similarly. Our benchmarks show 
that those different weighting schemes give very similar results. 



2.5. The field solver on the cylindrical coordinate systems 

In electrostatic DIPIC, one of the key issues is to construct a fast and stable Poisson solver. 
The solver should have good scalability when being parallelized. In this problem, it must be 
suitable for the case of 7? » Z to deal with the real-size CCP reactors. 

There were many Poisson solvers that can deal with the variant coefficient Poisson 
equation. Because of the variable dielectric constant e - I + x, the fast Poisson solver (based 
on the Fourier Transform or Cyclic Reduction |52|) can not be applied directly. Historically, 
some researchers have used global iterations to construct the solver [!3T1. These solvers can be 
constructed only by the Fast Poisson Solver, but they have variable convergence rates. When 
the dielectric coefficient^ varies, the iterating times of the solver can increase to unacceptable 
level. Goloub et. al Il53ll transformed ffie Poisson equation to a Helmholtz equation and 
solved it by a similar iteration solver. However, the solver has similar shortcoming. The 
Dynamic Alternating Direction Implicit (DADI) algoriffims 1391 can be applied here but 
showed low efficiency. Typical Krylov iteration methods (Conjugate Gradient or GMRES, etc 
) have similar shortcomings. Recently, some authors use Krylov procedures to deal with the 
Helmholtz equation from Goloub's method [54J. The major shortcomings of these algorithms 
are the complexities. 

In our case, because of the positivity of the the equation is a negative elliptical 
equation. The equation can be discretized by the finite volume scheme|8|. The only 
difference is the susceptibility at half integer point be selected io Xi+il2,j - j(Xij + Xi+ij) 

andxj,j+i/2 = jiXij +Xij+i)^ orxi+i/ij = max(xij,XMj) andxij+i/i = max(xij,Xij+i)- Our 
ID benchmarks didn't show any differences between the two methods. 

Consider the simple finite volume discrete scheme[8J or similar five-point discrete 
scheme, the discrete Poisson equation has the form 

- aij4>i-ij + bi^j(f>ij - Ci+ij4>ij 

-di,j4>u-\ -ei,j<Pi,j+i = h^fij (11) 

In uniform grids, this scheme has two order accuracy. All of the coefficients (a,b,c,d,e) 
are positive and we have bij = aij+Cij+dij+eij for the uniform grids. In addition, because ffie 
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Figure 1. The concepts of the grid z-coarsening 



X depends on the charge densities, thtxiXij) distribution is smooth spatially. So the multigrid 
method |55 , 56 1 is a good choice to construct the solver However, typical etching devices have 
cylindrical shapes and the radius are much larger than the height. In cylindrical systems, the 
standard multigrid solver can cause slow convergence rates: The convergence rates of typical 
Descartes multigrid 2D Poisson solvers are constants and less than 0.1, which means about 
8 - 10 V cycles reduce the error norm to 10 For 64 * 64 r - z cylindrical Poisson systems, 
the iterations increase to about 30 |57|. When the grid numbers on r direction increase, the 
converging speed becomes much lower Additionally, because the typical discharge device 
is large, we need parallelize the code. The parallelization of standard multigrid solvers is 
complex and case dependent. So we developed a semi-coarsening multigrid solver |56|. 

Semi-coarsening multigrid procedure has been applied to the anisotropic problem 
ll58l l59l . If the grid spacing on one direction is much less than the other directions, one 
can coarsen this direction grid only. On the other hand, the grid in our problems is uniform 
but we run the z-coarsening only. The concepts of z-coarsening are showed in Figure[T] After 
one turn coarsening, the grid spacing in z direction is doubled. 

When z-coarsening is applied, the coarsened differential stencil will become very 
anisotropic, so standard Gauss-Seidel smoothing is not effective. To overcome the difficulty, 
we apply a line-smoothing procedure: write the coarsened equations as 



where / = 1 , M along the z-direction and j - l,N along the r-direction. The Une smoothing 
is executed by / = 1 , M. For every i, the equation sets are written to 



-aij(f>i-i,j + bijcpij - Cij4>i^ij - dij(f>ij-i - eij4>ij+i - h fij 



- dij- 1 cf>i j_2 + bij^ 1 4>.j_ 1 - e; j_ 1 (I) .J = 

1 2 r , lOld , A.old 




(12) 
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When the equation set is solved, the /,/, hne smoothing has been executed. In every smoothing 
procedure, the line smoothing is executed from / = 1 to M with Red-Black order and the grid 
values of are updated. 

Numerical tests show that V(0,1) cycles provide good convergence rates. The algorithm 
can be described like follow: 

Algorithm 1. Multigrid V-cycle MG(u,f) 

- Wo 

do until converge 

f =/ 
do Z = L-1,2 
m' = 

/'->=/,'->(/'-Ay) 

enddo 

solve A^u^ - 
do I — 2,L 

u' - u' + l'i_i(u') 

linesmooth(u', /') 

enddo 

loop 

The restrict and prolong operators are simple hne forms: 

/,'_! = [1/4 1/2 l/4f (13) 

We benchmark the solver and find that when x - ^ the solver converges constantly for 
varied size grids with Dirichlet boundary condition. When x ~ ^ with smoothing spatial 
distribution, the solver converges to 10"'° by L°° norm after about 12 times V-cycles. 

Typically, the modeling should consider the external circuit of the reactors. For example, 
dc self-biasing voltage can be built up on the electrically powered electrodes due to the 
blocking capacitor The general way to include these effects is given by Verboncoeur 1 32 1 . The 
surface charges are included in the Poisson equation, coupled with the external circuit. In our 
cases, since the coupling between the electrodes is not strong, we have adopted the Vahedi's 
model 137] , which is the simplified version of Verboncoeur' method and can be numerically 
realized easily. In this model, the electric fields with external circuit can be obtained by linear 
superposition of the solutions to two problems: At first, the Poisson equation with exact charge 
density and a zero boundary condition is solved: 

V ■ (x'^(f>p) = -p 

<pp\electrode — 

Secondly, a Laplace equation with same coefficient and normalized boundary condition 
is solved. For example, if the upper electrode is grounded and RF power is applied on the 
lower electrode, the Laplace problem will be 

V • (x^H) = 

tpdupper - 0, (pdlower - 1 
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Finally, the voltage V' of the powered electrodes are calculated by solving the circuit 
equation and get the field solution. For example, when a capacitor is applied between the RF 
source and the powered electrode, the circuit equation becomes: 



where Q[g„y is the charge deposited on the RF electrode from f - 1 to t, Q and Qc are the 
charges deposited on the electrode and the capacitor, while V'^j is the instantaneous rf source 
voltage. However, because the solver is run with the i, v, the Qconv in the real solvers is 
calculated from this step's first pushing and the previous step's final pushing. Because the 
self-biasing is a slow varying process, this causes no observable errors. Then we get the ^ by 
(p - <pp + V'(pL- We need to solve more Laplace equations for the extra RF-applied electrodes. 

One can also include the external circuit eff'ects by physics insights l,14i ,15 1. This method 
is to force the net current flowing into the electrode to zero in one rf period. One can adjust 
Vdc within the voltage waveform of V = Vyf sin cjt + V^c to satisfying this condition. This 
method is very easy to be incorporated, but it is not recommended here because of its narrow 
applicability. 

2.6. Monte Carlo model 

Since the Monte Carlo model is a well-developed method, we only briefly discuss the 
numerical treatments. At present, our codes only include gas models for Ar, O2 and CF4. 
For electrons, we adopted the null colUsion method and the molecular velocity is assumed 
to be zero since » v„. For the non-reactive collisions between ion and neutral gas, null 
collision method is still adopted, and the molecular velocity is sampled from Maxwellian 
distribution. The velocities of the electrons and the ions after non-reactive collisions are given 
by Vahedi' method 1411 . The velocity of the ion after a reactive collision is given by Nanbu 
and Denpoh's method Il23l l25l . The Ar cross sections come from |60| while the O2 cross 
sections come from f^T), and the CF4 cross sections come from the BOLSIG package |61|. 
All of them are linearly interpolated. For the energy being higher than the available data, we 
extrapolated the cross sections by the IjE law ll62l . 

The standard sampling procedure of the null collision method is still adopted here, 
regardless the different weights of the particles. We have also proposed weighting-based 
sampling procedure for the null collision method. Our ID benchmarks of both methods have 
showed that, weighting-based sampling procedure is more accurate, especially when larger 
density gradient exists. However, there is only up to 10% difference between the two sampling 
procedures, and the weighting-based sampling procedure runs slower and has some problems 
in 2D problems. So we still use the standard sampling procedure in our present research, and 
we will discuss the new method elsewhere. 

In the Monte Carlo processes in which the new particles are generated, the weights of 
the new particle is just duplicated from the incident particles. The accelerations a of the new 
particles must be set to a reasonable value. We set the accelerations according to the charge 
mass ratio; 



Q' = Q'-' + C[V' - y'] - Q';' + Q[ 



(14) 



(15) 



a,- - aj 



qjmi 



2.7. Speeding up and parallelization 



There are many ways to speed up the code lfTSl l63l l64l . We have adopted subcy cling in 
our code with Vahedi's method ll35l . and speed boost nearly 2 is achieved for ID cases. 
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Particle sorting, which is very successful in our explicit code IfTSll . shows no speeding up 
and sometimes it is even more slow than the unsorted cases. The reason is partially because 
the particle sorting is a time consuming operation, and partially because the grid sizes are 
much smaller than the explicit simulation and thus the cache missing is not very serious. 

On the other hand, although implicit algorithm can be executed by much smaller grid 
numbers than explicit methods, the simulation particles number is still very large. So we 
need parallelize the code ifTSl . Here we adopt the MPl_ALLReduce framework ll63l [TSl : 
X, p and Qconv are summed up to each processor, the serial Poisson solver are executed 
on every processor to calculate the fields. Since the simulation size is not very large here, 
the communication time and Poisson equation solving time are very little. So the parallel 
efficiencies are satisfying. 

3. Benchmarks and results 

All simulations are carried out on our 12 nodes PC clusters: each node has an Intel Core2 
E4500 CPU and 2G memory. The nodes are connected by lOOOM ethernet networks. The 
clusters have about 210G FLOPS Rpeak and about HOG FLOPS Unpack Rmax- We normally 
run two processes in one CPU for 2D parallel simulations. 

The physical parameters of the benchmark problems are listed as follows: The frequency 
of rf source airf - 27rl3.56MHz. Voltage source is applied to ffie electrode at z = 0cm with 
the waveform of Vrf = 200 sin w,/?. Argon gas is used with the pressure of lOOmTorr and 
the temperature of 300K. The electrodes spacing is 2cm while the radius is 8cm and the gap 
between the lower power electrode to the grounded outer cylinder is 2cm. Here we do not 
consider the self-biasing effect. The initial density is uniform of 5 x lO'^m"^ for all cells and 
200 particles are placed randomly within one cell. All simulations are run for 1000 rf periods, 
and all results below are given by averaging one rf period. 

3.1. ID results 

The ID simulations are performed to benchmark the implicit results with the explicit 
simulations and to determine the space and time steps. The ID simulations are run serially 
in one node and will only take about several ten minutes for implicit code (with 64 cells 
and Af = 0.5 x 10"'*'i ) and about several ten hours for explicit code (512 cells and 
At= 1.25 X 10"",?). 

Firstly we compare the results of implicit and explicit algorithms. Fig. |2^ shows the 
average electron density profiles with different spacing and time steps. For the implicit 
numerical schemes, we have adopted ;(f,+i/2,j = jOdj + Xi+ij) a^d^jj+i/i = jiXij + Xu+i)- 
From Fig. |2^, we can clearly see that implicit schemes can give reasonable densities. 
However, the densities from the implicit schemes are lower than those from the explicit code. 
Larger space and time steps tend to give smaller center density and smaller bulk plasma length 
due to the excessive damping of the high frequency modes. We also find that the damping error 
is more sensitive to the time steps than the space steps. This is beneficial to the simulations 
because the computational cost is inversely proportional to the square of the grid size and is 
only proportional to time steps. This would allow us to use larger space step and smaller time 
steps while keeping the accuracy. It seems that A^, = 64 and Af - 0.5 x 10"'"i is sufficient to 
ffie simulations, because ffie bulk plasma length is nearly identical to ffiat of ffie explicit one 
and the plasma density is only about 20% lower We considered this difference is acceptable, 
because most physics involved here is still kept. One can also adopt finer space and time 
spacing to reach better accuracy but cost more running time. 
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Z (cm) Z (cm) 

Figure 2. Simulation results with different space and time steps (the former number is the 
number of the grid and the later one is the time steps xlO""i): (a) average electron density 
profiles ; (b) time averaging potential. 

Fig. I2J3 shows time averaging potentials from the same parameters. There are little 
differences between different space and time steps, and potential of the explicit scheme is 
slightly lower than that of the implicit one. It can be seen that over a wide range, the implicit 
code is stable. 

3.2. 2D results 

The 2D simulations of the benchmark problem are run paralleled in 4 or 8 nodes. Square cells 
are used and z direction is uniformly divided into 64 cells. The space and time steps are fixed 
to Ax - 0.02/64mand Af(. = Af, = 0.5xlO""^s. The numerical schemes are chosen as follows: 
(1) particles are moving in X-Y-Z coordinates; (2) weighting charge density and interpolating 
the field in z - scheme; (3) Xi+ijij = max{xi,j,XM,j) and Xi,j+\li = max{xi,j,Xi.i+\)'^ 
(4)the potential is logarithm interpolated at the gap between the lower electrode and the outer 
cylinder; (5) charge conservation scheme is used; and (6) voltage source is directly applied 
to the electrode and no external circuit is considered. During most time of the simulations, 
totally 3 - 8 X 10^ super particles per species are traced. The simulation will take 30 to 90 
hours for one simulation in 4 nodes depending on the specific numerical schemes. 

The time averaging electron and ion density over one period are shown Figl3] The 
amplitude and profiles are consistent with the optical emission tomography results [65 1, the 
fluid simulations Il66l and the explicit simulation results IfTSl . The density distribution along 
Z is very similar to the ID results. There are two densities peaks existing along R. One is near 
the axis and the other is near the gap between the electrodes and the outer cylinder, formed a 
saddle-like profile. This profile has also been observed in experiments 1651 . There are some 
noises in the axis of the ion density. 

The time averaging potential C), and Er are illustrated in Fig. |4] It shows that O and 
E^ have the profiles along Z similar to the ID results except for in the region near the gap. The 
E,- is smaller than E^ and only obvious at large radius, because there is no rf voltage applied 
and a radial sheath is formed. 

4. Discussion and summary 

In the present work, we have developed an implicit and electrostatic Particle-in-ceU/Monte 
Carlo model in two-dimensional axisymmetric geometry. We discussed the available 
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Figure 3. 2D average electron(a) and ion(b) density profiles. 




Figure 4. 2D average (a.)<S>, (b)E^ and (c)^,. profiles. 

algorithms in the cylindrical implicit simulations in detail. Benefits and shortcoming 
of several possible algorithms were analyzed and compared to select the most reliable 
technologies. ID and 2D benchmarks were executed to validate the code and show the 
possible errors of the algorithms. Although our code can be used to study most practical 
CCP devices and the results seem satisfying, some issues still need to be addressed. 

Although our MC code has included the model for O2 and CF4, simulation for these 
electronegative gas is only possible for ID case, for the electrons and ions are equally 
weighted in our present model. One should adopt species-depending weight scheme|25). 
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but we have not worked out all the details for the MC process and the resizing method in this 
complex case. 

Another problem should be mentioned is the smoothing. Filtering or smoothing |[8l [34ll 
for the summed-up charge density can be used before solving the Poisson equations. In 
Z direction, binomial filter can be applied, and volume corrected filters can be used in R 
direction. However, we find conventional filters have no significant effects, even up to 200 
pass being used for both p and x- We believed the implicit method has significantly damped 
the high frequency mode, so the smoothing should be different from the implicit model. 

The density in the axis shows 10% ~ 30% difference from the adjacent line of grids. This 
phenomenon is also observed in DSMC simulations |9 50 1 and is attributed to the numerical 
diffusion effects. Furthermore, the density in the axis is very noisy. Although it seems that this 
phenomenon has little effects on the final results. We are now trying to solve this problem. 

It seems the Dl implicit scheme is over damped, and thus produces smaller density than 
explicit scheme in some cases. Our ID benchmarks have shown that adjustable damping 
schemes f40| could give better results with same space and time steps. 

When incorporating external static magnetic field into this model, this model can be used 
for many other similar devices, such as magnetrons and Hall thrusters. The major differences 
are the particle moving and the field solver, which have been studied before [31 40 1. We are 
now trying to improve the model by addressing above issues, and also adding more gas model 
into our code. 
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